06. Denormalization in Apache Cassandra

Note of correction:
At 2:55 of the video, the instructor says "Losing customers to outages or low latency is not [inexpensive]."
She should have said "Losing customers to outages or poor performance is not [inexpensive]."

Data Modeling in Apache Cassandra:

  • Denormalization is not just okay -- it's a must
  • Denormalization must be done for fast reads
  • Apache Cassandra has been optimized for fast writes
  • ALWAYS think Queries first
  • One table per query is a great strategy
  • Apache Cassandra does not allow for JOINs between tables
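
The points above can be illustrated with a minimal sketch. It does not talk to a real Cassandra cluster: plain Python dictionaries stand in for tables, and the table names, function names, and sample rows are all hypothetical. The idea it tries to show is that each query gets its own table, every write is duplicated into each of those tables, and every read touches exactly one table, so no JOIN is ever needed.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two Cassandra tables.
# Each "table" is keyed by the column its query filters on.
albums_by_artist = defaultdict(list)  # serves: "all albums by a given artist"
albums_by_year = defaultdict(list)    # serves: "all albums released in a given year"


def insert_album(artist, album, year):
    """Write the same album into every query table (denormalization)."""
    row = {"artist": artist, "album": album, "year": year}
    # The row is copied, so the data is genuinely stored twice.
    albums_by_artist[artist].append(dict(row))
    albums_by_year[year].append(dict(row))


def get_albums_by_artist(artist):
    """Single-table lookup for the 'by artist' query."""
    return [r["album"] for r in albums_by_artist.get(artist, [])]


def get_albums_by_year(year):
    """Single-table lookup for the 'by year' query."""
    return [r["album"] for r in albums_by_year.get(year, [])]


insert_album("The Beatles", "Let It Be", 1970)
insert_album("The Beatles", "Rubber Soul", 1965)
insert_album("The Who", "My Generation", 1965)

print(get_albums_by_artist("The Beatles"))  # ['Let It Be', 'Rubber Soul']
print(get_albums_by_year(1965))             # ['Rubber Soul', 'My Generation']
```

The cost of this design is extra writes and duplicated storage, which matches the notes above: writes are cheap in Apache Cassandra, and the duplication is what keeps reads fast.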

Commonly Asked Questions:

  • I see certain downsides of this approach, since in a production application, requirements change quickly and I may need to improve my queries later. Isn't that a downside of Apache Cassandra?

    In Apache Cassandra, you model your data around your queries. If your business requirements change quickly, you will need to create a new table for each new query; that is simply how Apache Cassandra is meant to be used. If your business need calls for ad hoc queries, those are not a strength of Apache Cassandra. However, keep in mind that it is easy to create a new table that fits your new query.
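
As a rough illustration of "create a new table for the new query", the sketch below re-keys existing rows by a different column. Plain Python dictionaries stand in for Cassandra tables, and `build_table`, the table names, and the sample rows are hypothetical. In a real cluster the equivalent step would be defining a new table and copying the existing data into it, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict

# Hypothetical existing table, built for the original query: "albums by artist".
albums_by_artist = {
    "The Beatles": [
        {"artist": "The Beatles", "album": "Let It Be", "year": 1970},
        {"artist": "The Beatles", "album": "Rubber Soul", "year": 1965},
    ],
    "The Who": [
        {"artist": "The Who", "album": "My Generation", "year": 1965},
    ],
}


def build_table(rows, key):
    """Build a new query table by re-keying existing rows on `key`."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(dict(row))  # copy: the data is duplicated
    return table


# A new requirement arrives: "albums released in a given year".
# Instead of changing the old table, add a second one that fits the new query.
existing_rows = [row for rows in albums_by_artist.values() for row in rows]
albums_by_year = build_table(existing_rows, "year")

print(sorted(albums_by_year))                         # [1965, 1970]
print([r["album"] for r in albums_by_year[1970]])     # ['Let It Be']
```

The original table is left untouched, so the existing query keeps working while the new table serves the new one.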

Additional Resource:

Here is a reference to the DataStax documentation on [Apache Cassandra](https://docs.datastax.com/en/dse/6.7/cql/cql/ddl/dataModelingApproach.html).

True or False:
In Apache Cassandra data modeling, denormalization of tables is required.

SOLUTION: True

True or False:
When doing data modeling in Apache Cassandra, one table per query is a very acceptable practice.

SOLUTION: True

True or False:
When doing data modeling in Apache Cassandra, knowing your queries first and modeling to those queries is essential.

SOLUTION: True

Understanding the answers to the above quiz questions is KEY. Once you shift your thinking to this model, all the rest is easy!